network latency

AI-Driven Health Monitoring of Distributed Computing Architecture: Insights from XGBoost and SHAP

Sun, Xiaoxuan, Yao, Yue, Wang, Xiaoye, Li, Pochun, Li, Xuan

arXiv.org Artificial Intelligence

With the rapid development of artificial intelligence, its application to the optimization of complex computer systems is expanding. Edge computing is an efficient distributed computing architecture, and the health of its nodes directly affects the performance and reliability of the entire system. To address the limited accuracy and interpretability of traditional methods for judging node health, this paper proposes a health-assessment method based on XGBoost and uses the SHAP method to analyze the model's interpretability. Experiments verify that XGBoost performs well on the complex features and nonlinear data of edge computing nodes, particularly in capturing the impact of key features (such as response time and power consumption) on node status. SHAP value analysis further reveals the global and local importance of features, so the model not only discriminates node status with high precision but also provides intuitive explanations that supply data support for system optimization. The research shows that combining AI technology with computer-system optimization enables intelligent monitoring of edge-node health and provides a scientific basis for dynamic scheduling, resource management, and anomaly detection. In the future, as AI technology develops further, model dynamics, cross-node collaborative optimization, and multimodal data fusion will become research priorities, supporting the intelligent evolution of edge computing systems.
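
A minimal sketch of the kind of pipeline the abstract describes, assuming the xgboost and shap packages are available; the feature names (response time, power draw, CPU load), the synthetic data, and the labeling rule are illustrative assumptions, not the paper's dataset or code:

```python
# Minimal sketch (not the paper's code): classify edge-node health with XGBoost
# and explain predictions with SHAP. Features and data are illustrative.
import numpy as np
import shap
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n = 2000
# Hypothetical node metrics: response time (ms), power draw (W), CPU load (%)
X = np.column_stack([
    rng.gamma(2.0, 20.0, n),      # response_time_ms
    rng.normal(35.0, 8.0, n),     # power_w
    rng.uniform(0.0, 100.0, n),   # cpu_load_pct
])
# Toy labeling rule: a node is "unhealthy" when latency and power are both high
y = ((X[:, 0] > 60) & (X[:, 1] > 40)).astype(int)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X, y)

# SHAP values give per-feature contributions for each prediction (local) and,
# averaged in magnitude, a global feature-importance ranking.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
feature_names = ["response_time_ms", "power_w", "cpu_load_pct"]
global_importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, global_importance), key=lambda t: -t[1]):
    print(f"{name}: mean |SHAP| = {imp:.3f}")
```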


PEERNet: An End-to-End Profiling Tool for Real-Time Networked Robotic Systems

Narayanan, Aditya, Kasibhatla, Pranav, Choi, Minkyu, Li, Po-han, Zhao, Ruihan, Chinchali, Sandeep

arXiv.org Artificial Intelligence

Networked robotic systems balance compute, power, and latency constraints in applications such as self-driving vehicles, drone swarms, and teleoperated surgery. A core problem in this domain is deciding when to offload a computationally expensive task to the cloud, a remote server, at the cost of communication latency. Task offloading algorithms often rely on precise knowledge of system-specific performance metrics, such as sensor data rates, network bandwidth, and machine learning model latency. While these metrics can be modeled during system design, uncertainties in connection quality, server load, and hardware conditions introduce real-time performance variations, hindering overall performance. We introduce PEERNet, an end-to-end and real-time profiling tool for cloud robotics. PEERNet enables performance monitoring on heterogeneous hardware through targeted yet adaptive profiling of system components such as sensors, networks, deep-learning pipelines, and devices. We showcase PEERNet's capabilities through networked robotics tasks, such as image-based teleoperation of a Franka Emika Panda arm and querying vision language models using an Nvidia Jetson Orin. PEERNet reveals non-intuitive behavior in robotic systems, such as asymmetric network transmission and bimodal language model output. Our evaluation underscores the effectiveness and importance of benchmarking in networked robotics, demonstrating PEERNet's adaptability. Our code is open-source and available at github.com/UTAustin-SwarmLab/PEERNet.
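
A rough sketch of per-stage latency profiling in the spirit of what PEERNet measures; this is not the PEERNet API, and the stage names and callback structure are assumptions made for illustration:

```python
# Minimal sketch of per-stage latency profiling for a networked robotics
# pipeline (sensor -> uplink -> inference -> downlink). Not the PEERNet API.
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)

@contextmanager
def stage(name):
    """Record wall-clock latency of one pipeline stage."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)

def run_once(capture, upload, infer, download):
    with stage("sensor"):
        frame = capture()
    with stage("uplink"):
        upload(frame)
    with stage("inference"):
        result = infer(frame)
    with stage("downlink"):
        download(result)
    return result

def report():
    # Per-stage statistics can expose behavior such as asymmetric uplink vs.
    # downlink transmission times or bimodal inference latency.
    for name, samples in timings.items():
        samples = sorted(samples)
        p50 = samples[len(samples) // 2]
        print(f"{name}: n={len(samples)}, median={p50 * 1e3:.1f} ms, "
              f"max={samples[-1] * 1e3:.1f} ms")
```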


NetROS-5G: Enhancing Personalization through 5G Network Slicing and Edge Computing in Human-Robot Interactions

Dalgkitsis, Anestis, Verikoukis, Christos

arXiv.org Artificial Intelligence

Robots are increasingly being used in a variety of applications, from manufacturing and healthcare to education and customer service. However, the mobility, power, and price points of these robots often dictate that they do not have sufficient computing power on board to run modern algorithms for personalization in human-robot interaction at desired rates. This can limit the effectiveness of the interaction and limit the potential applications for these robots. 5G connectivity provides a solution to this problem by offering high data rates, bandwidth, and low latency that can facilitate robotics services. Additionally, the widespread availability of cloud computing has made it easy to access almost unlimited computing power at a low cost. Edge computing, which involves placing compute resources closer to the action, can offer even lower latency than cloud computing. In this paper, we explore the potential of combining 5G, edge, and cloud computing to provide improved personalization in human-robot interaction. We design, develop, and demonstrate a new framework, entitled NetROS-5G, to show how the performance gained by utilizing these technologies can overcome network latency and significantly enhance personalization in robotics. Our results show that the integration of 5G network slicing, edge computing, and cloud computing can collectively offer a cost-efficient and superior level of personalization in a modern human-robot interaction scenario.


Safe Networked Robotics with Probabilistic Verification

Narasimhan, Sai Shankar, Bhat, Sharachchandra, Chinchali, Sandeep P.

arXiv.org Artificial Intelligence

Autonomous robots must utilize rich sensory data to make safe control decisions. To process this data, compute-constrained robots often require assistance from remote computation, or the cloud, that runs compute-intensive deep neural network perception or control models. However, this assistance comes at the cost of a time delay due to network latency, resulting in past observations being used in the cloud to compute the control commands for the present robot state. Such communication delays could potentially lead to the violation of essential safety properties, such as collision avoidance. This paper develops methods to ensure the safety of robots operated over communication networks with stochastic latency. To do so, we use tools from formal verification to construct a shield, i.e., a run-time monitor, that provides a list of safe actions for any delayed sensory observation, given the expected and maximum network latency. Our shield is minimally intrusive and enables networked robots to satisfy key safety constraints, expressed as temporal logic specifications, with desired probability. We demonstrate our approach on a real F1/10th autonomous vehicle that navigates in indoor environments and transmits rich LiDAR sensory data over congested WiFi links.
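
A toy illustration of the shield idea on a grid world, assuming a bounded number of delayed control steps; this is a simplified exhaustive reachability check, not the paper's probabilistic verification or temporal-logic machinery:

```python
# Toy sketch of a delay-aware "shield": given the last observed state and a
# bound on how many control steps of delay the network can introduce, allow
# only actions that keep every delay-consistent state collision-free.
from itertools import product

ACTIONS = {"stay": (0, 0), "up": (0, 1), "down": (0, -1),
           "left": (-1, 0), "right": (1, 0)}

def step(state, action):
    x, y = state
    dx, dy = ACTIONS[action]
    return (x + dx, y + dy)

def reachable_states(observed_state, max_delay_steps):
    """All states the robot may occupy after up to max_delay_steps unknown moves."""
    frontier = {observed_state}
    reached = set(frontier)
    for _ in range(max_delay_steps):
        frontier = {step(s, a) for s in frontier for a in ACTIONS}
        reached |= frontier
    return reached

def safe_actions(observed_state, max_delay_steps, obstacles):
    """Actions whose outcome avoids obstacles from every delay-consistent state."""
    candidates = reachable_states(observed_state, max_delay_steps)
    safe = []
    for action in ACTIONS:
        if all(step(s, action) not in obstacles for s in candidates):
            safe.append(action)
    return safe

# Example: with one step of possible delay, only actions that are safe from
# every state consistent with the stale observation are permitted.
print(safe_actions((2, 2), 1, obstacles={(3, 3), (2, 4)}))
```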


Adaptive DNN Surgery for Selfish Inference Acceleration with On-demand Edge Resource

Yang, Xiang, Chen, Dezhi, Qi, Qi, Wang, Jingyu, Sun, Haifeng, Liao, Jianxin, Guo, Song

arXiv.org Artificial Intelligence

Deep Neural Networks (DNNs) have significantly improved the accuracy of intelligent applications on mobile devices. DNN surgery, which partitions DNN processing between mobile devices and multi-access edge computing (MEC) servers, can enable real-time inference despite the computational limitations of mobile devices. However, DNN surgery faces a critical challenge: determining the optimal computing resource demand from the server and the corresponding partition strategy, while considering both inference latency and MEC server usage costs. This problem is compounded by two factors: (1) the finite computing capacity of the MEC server, which is shared among multiple devices, leading to inter-dependent demands, and (2) the shift in modern DNN architecture from chains to directed acyclic graphs (DAGs), which complicates potential solutions. In this paper, we introduce a novel Decentralized DNN Surgery (DDS) framework. We formulate the partition strategy as a min-cut and propose a resource allocation game to adaptively schedule the demands of mobile devices in an MEC environment. We prove the existence of a Nash Equilibrium (NE), and develop an iterative algorithm to efficiently reach the NE for each device. Our extensive experiments demonstrate that DDS can effectively handle varying MEC scenarios, achieving up to 1.25$\times$ acceleration compared to the state-of-the-art algorithm.
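
A toy sketch of the "partition strategy as a min-cut" idea using networkx; the graph construction, layer names, and latency numbers are illustrative assumptions and do not reproduce the DDS resource-allocation game or its iterative equilibrium algorithm:

```python
# Toy sketch of DNN surgery as a min-cut: layers on the source side of the cut
# run on the mobile device, layers on the sink side run on the MEC server, and
# edge capacities stand in for compute and transfer latencies (all made up).
import networkx as nx

device_ms = {"conv1": 12.0, "conv2": 18.0, "fc": 6.0}   # on-device compute
server_ms = {"conv1": 2.0, "conv2": 3.0, "fc": 1.0}     # on-server compute
transfer_ms = {("conv1", "conv2"): 4.0, ("conv2", "fc"): 1.5}
input_tx_ms, output_tx_ms = 25.0, 0.5                   # raw input / result transfer

G = nx.DiGraph()
for layer in device_ms:
    # Cutting (device -> layer) assigns the layer to the server: pay server compute.
    G.add_edge("device", layer, capacity=server_ms[layer])
    # Cutting (layer -> server) assigns the layer to the device: pay device compute.
    G.add_edge(layer, "server", capacity=device_ms[layer])
for (u, v), ms in transfer_ms.items():
    # Cutting between adjacent layers means the intermediate tensor crosses the network.
    G.add_edge(u, v, capacity=ms)
# Pay input upload if the first layer is offloaded, result download if the last is.
G["device"]["conv1"]["capacity"] += input_tx_ms
G["device"]["fc"]["capacity"] += output_tx_ms

cut_value, (device_side, server_side) = nx.minimum_cut(G, "device", "server")
print(f"estimated end-to-end latency: {cut_value:.1f} ms")
print("run on device:", sorted(device_side - {"device"}))
print("run on server:", sorted(server_side - {"server"}))
```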


A friendly introduction to machine learning compilers and optimizers

#artificialintelligence

I have a confession to make. I became a machine learning engineer so that I wouldn't have to worry about compilers. However, as I learned more about bringing ML models into production, the topic of compilers kept coming up. In many use cases, especially when running an ML model on the edge, the model's success still depends on the hardware it runs on, which makes it important for people working with ML models in production to understand how their models are compiled and optimized to run on different hardware accelerators. Ideally, the compiler would be invisible and everything would "just work". However, we are still many years away from that. As more and more companies want to bring ML to the edge, and more and more hardware is being developed for ML models, more and more compilers are being developed to bridge the gap between ML models and hardware accelerators: MLIR dialects, TVM, XLA, PyTorch Glow, cuDNN, etc. According to Soumith Chintala, creator of PyTorch, as ML adoption matures, companies will compete on who can compile and optimize their models better. Understanding how compilers work can help you choose the right compiler to bring your models to your hardware of choice as well as diagnose performance issues and speed up your models.
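
As one concrete, minimal example of the compile-and-optimize workflow the post describes (not taken from the post itself, which discusses TVM, XLA, Glow, and others), PyTorch 2.x's torch.compile can be benchmarked against eager execution on whatever hardware is at hand; the toy model and batch size below are assumptions:

```python
# Hedged illustration of "compile your model for the hardware it runs on":
# compare eager PyTorch against torch.compile on a small toy network.
import time
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
).eval()
x = torch.randn(8, 3, 224, 224)

compiled = torch.compile(model)  # traces and optimizes for the local backend

def bench(fn, warmup=3, iters=10):
    with torch.no_grad():
        for _ in range(warmup):
            fn(x)
        start = time.perf_counter()
        for _ in range(iters):
            fn(x)
    return (time.perf_counter() - start) / iters

print(f"eager:    {bench(model) * 1e3:.1f} ms/batch")
print(f"compiled: {bench(compiled) * 1e3:.1f} ms/batch")
```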


Does Form Follow Function? An Empirical Exploration of the Impact of Deep Neural Network Architecture Design on Hardware-Specific Acceleration

Abbasi, Saad, Shafiee, Mohammad Javad, Chan, Ellick, Wong, Alexander

arXiv.org Artificial Intelligence

The fine-grained relationship between form and function with respect to deep neural network architecture design and hardware-specific acceleration is one area that is not well studied in the research literature, with form often dictated by accuracy as opposed to hardware function. In this study, a comprehensive empirical exploration is conducted to investigate the impact of deep neural network architecture design on the degree of inference speedup that can be achieved via hardware-specific acceleration. More specifically, we empirically study the impact of a variety of commonly used macro-architecture design patterns across different architectural depths through the lens of OpenVINO microprocessor-specific and GPU-specific acceleration. Experimental results showed that while leveraging hardware-specific acceleration achieved an average inference speed-up of 380%, the degree of inference speed-up varied drastically depending on the macro-architecture design pattern, with the greatest speedup achieved on the depthwise bottleneck convolution design pattern at 550%. Furthermore, we conduct an in-depth exploration of the correlation between FLOPs requirement, level 3 cache efficacy, and network latency with increasing architectural depth and width. Finally, we analyze the inference time reductions using hardware-specific acceleration when compared to native deep learning frameworks across a wide variety of hand-crafted deep convolutional neural network architecture designs as well as ones found via neural architecture search strategies. We found that the DARTS-derived architecture benefited most from hardware-specific software acceleration (1200%), while the depthwise bottleneck convolution-based MobileNet-V2 had the lowest overall inference time, around 2.4 ms.
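
A hedged sketch of the benchmarking methodology only: timing a plain convolution block against a MobileNet-V2-style depthwise bottleneck block under an identical harness. The block definitions, tensor sizes, and the use of plain PyTorch (rather than the paper's OpenVINO acceleration) are simplifying assumptions:

```python
# Simplified illustration of comparing macro-architecture design patterns by
# measured latency; not the paper's OpenVINO setup or its architectures.
import time
import torch
import torch.nn as nn

def plain_block(ch):
    return nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU())

def depthwise_bottleneck(ch, expand=6):
    hidden = ch * expand
    return nn.Sequential(
        nn.Conv2d(ch, hidden, 1), nn.BatchNorm2d(hidden), nn.ReLU6(),
        nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden),  # depthwise conv
        nn.BatchNorm2d(hidden), nn.ReLU6(),
        nn.Conv2d(hidden, ch, 1), nn.BatchNorm2d(ch),
    )

def latency_ms(block, ch=64, size=56, warmup=5, iters=50):
    block.eval()
    x = torch.randn(1, ch, size, size)
    with torch.no_grad():
        for _ in range(warmup):
            block(x)
        start = time.perf_counter()
        for _ in range(iters):
            block(x)
    return (time.perf_counter() - start) / iters * 1e3

print(f"plain conv block:     {latency_ms(plain_block(64)):.2f} ms")
print(f"depthwise bottleneck: {latency_ms(depthwise_bottleneck(64)):.2f} ms")
```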


Q&A: Solving connected car challenges with edge AI (Includes interview)

#artificialintelligence

Over 72.5 million connected car units are estimated to be sold by 2023, enabling nearly 70% of all passenger vehicles to actively exchange data with external sources. The volume of data produced by these smart vehicles will overwhelm traditional data processing solutions, and the latency of processing it can lead to potential life-or-death scenarios, according to Ramya Ravichandar from Foghorn. We speak with Ravichandar about how connected car manufacturers are implementing edge AI solutions for real-time video recognition, multi-factor authentication, and other innovative capabilities to decrease network latency and optimize data gathering, analysis, and security. Digital Journal: What are the current trends with autonomous and connected cars? Ramya Ravichandar: Automotive companies are looking to improve real-time functionalities and accelerate autonomous operations of passenger vehicles. Connected vehicle technology is introducing a new dimension of transportation by extending vehicle operations and controls beyond the driver to include internal networks and systems.